
Mitigate race condition in network caching #299


Open
wants to merge 2 commits into main

Conversation

juleskers commented May 27, 2025

Hey, you may remember me from #291

Gitnuro is still fascinating me.
I've since downloaded the source code to have a go at the AvatarProviders.
I really like the refactoring you did since the original "useGravatar" boolean. Good job!

While toying with it, I noticed that the Gravatar provider makes multiple network requests for the same images, which surprised me given the InMemoryCache.

I've documented my findings in the commits.
TL;DR: the NetworkImageLoader can trigger multiple fetches for the same URL/image if a series of commits by the same author is processed before the first image has finished loading from the web.

It's bedtime for me now, but I'll check back tomorrow.

Jules Kerssemakers added 2 commits May 27, 2025 23:27
Users deserve to know when their software is doing network access.
Additionally, it makes it easier for us developers to verify that the cache works.
…ed repeatedly from network.

There's a race condition in the NetworkImageLoader / InMemoryImagesCache.
Since image loading is done asynchronously, the UI will queue up multiple commits while the first image is still
streaming in from the net.
If the next commit(s) is (are) from the same author, this leads to a cache miss (the previous fetch isn't yet
finished, thus not yet cached), which queues up ANOTHER network fetch.
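
For illustration, here's a minimal, self-contained sketch of that racy check-then-fetch pattern. The names (`cache`, `fetchFromNetwork`, `loadAvatar`) are made up for this example and are not the actual Gitnuro code:

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-ins for InMemoryImagesCache / NetworkImageLoader.
val cache = ConcurrentHashMap<String, ByteArray>()
val networkCalls = AtomicInteger(0)

suspend fun fetchFromNetwork(url: String): ByteArray {
    networkCalls.incrementAndGet()
    delay(200)                        // simulate a slow download
    return ByteArray(0)
}

// The racy pattern: check the cache, fetch on a miss, then store.
suspend fun loadAvatar(url: String): ByteArray {
    cache[url]?.let { return it }     // every caller misses while the first fetch is in flight
    val image = fetchFromNetwork(url)
    cache[url] = image
    return image
}

fun main() = runBlocking {
    // Five commits by the same author are rendered before the first download completes...
    coroutineScope { repeat(5) { launch { loadAvatar("https://example.invalid/avatar") } } }
    println("network calls: ${networkCalls.get()}")   // prints 5, not 1
}
```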

This is a classic footgun in async code, going all the way back to Java 1.0 with `synchronized`.

To mitigate: use double-checked locking, i.e. re-test the cache *inside* the semaphore's critical section.
Even then, the semaphore count must be reduced to 1;
any higher will lead to the critical section being executed multiple times in parallel, which
means the re-check will *still* race with a parallel execution.
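
A rough sketch of that mitigation, assuming a kotlinx.coroutines `Semaphore` guards the loader and reusing the hypothetical `cache` / `fetchFromNetwork` from the sketch above (Gitnuro's actual field names will differ):

```kotlin
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// A single permit: with more permits, two coroutines could both pass the
// inner re-check concurrently and still download the same URL twice.
val fetchPermit = Semaphore(permits = 1)

suspend fun loadAvatarMitigated(url: String): ByteArray {
    cache[url]?.let { return it }             // first check, outside the critical section
    return fetchPermit.withPermit {
        cache[url] ?: run {                   // second check, inside the critical section:
            val image = fetchFromNetwork(url) // another coroutine may have filled the cache
            cache[url] = image                // while we were waiting for the permit
            image
        }
    }
}
```

The trade-off is that a single permit serialises all avatar downloads, which is the parallelism concern raised in the review below.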

A proper fix will require smarter locking, involving URL-specific locks.
(probably something with concurrent maps containing futures?)
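
As a sketch of what that could look like, assuming a `ConcurrentHashMap` of `Deferred`s and again reusing the hypothetical `fetchFromNetwork` from above (none of these names are from the real code):

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async
import java.util.concurrent.ConcurrentHashMap

// One shared Deferred per URL: concurrent callers for the same URL await the
// same in-flight download instead of each starting their own.
class DedupImageLoader(private val scope: CoroutineScope) {
    private val inFlight = ConcurrentHashMap<String, Deferred<ByteArray>>()

    suspend fun load(url: String): ByteArray =
        inFlight.computeIfAbsent(url) {
            scope.async { fetchFromNetwork(url) }
        }.await()
}
```

A real version would also need to evict failed entries (so a transient network error can be retried), but it keeps unrelated URLs downloading in parallel while deduplicating identical ones.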

See also:
https://en.wikipedia.org/wiki/Double-checked_locking
JetpackDuba (Owner)

Hello!

Yes, I'm fully aware of it, but I didn't want to spend much time on it for now, as I'm planning a "big cleanup" for v1.6 to fix all the small details like this one.

Looking at the changes, I'm not sure it's a real improvement. While it saves some requests, it limits how many images can be loaded in parallel, delaying the currently displayed avatars when scrolling fast. Not a big deal either, but I don't really see it as an improvement.

JetpackDuba (Owner) commented Jun 1, 2025

Probably something similar to this would work out better.

juleskers (Author)

Hey, thanks for the feedback!

I get that time is slim, but this is being a bad net-citizen. Such request-bombing is a good way to get your user-agent string blocked. (Does Gitnuro actually have its own user-agent string? I haven't checked yet...)

Your "reduction of requests" branch looks like the approach I wanted to work towards, but I need to learn more Kotlin first 😅. I'm a professional java de, so eat JVM for breakfast, but Kotlin is still kicking my ass with how convenient-but-different everything is.

If you want to free up your own time, I can take over your reduction branch and iterate on it. (There are a few synchronisation issues left that I recognise from Java, despite your use of `synchronized`.)

JetpackDuba (Owner)

Sure, go ahead with it
